105 research outputs found

    Balanced boosting with parallel perceptrons

    The final publication is available at Springer via http://dx.doi.org/10.1007/11494669_26
    Proceedings of 8th International Work-Conference on Artificial Neural Networks, IWANN 2005, Vilanova i la Geltrú, Barcelona, Spain, June 8-10, 2005.
    Boosting constructs a weighted classifier out of possibly weak learners by successively concentrating on the patterns that are harder to classify. While it gives excellent results in many problems, its performance can deteriorate in the presence of patterns with incorrect labels. In this work we use parallel perceptrons (PP), a novel approach to the classical committee machines, to detect whether a pattern’s label may be incorrect and also whether the pattern is redundant, in the sense of being well represented in the training sample by many other similar patterns. Among other things, PP allow margins for hidden unit activations to be defined naturally, and we use these margins to define the above pattern types. This pattern type classification allows a more nuanced approach to boosting. In particular, the procedure we propose, balanced boosting, uses it to modify the boosting distribution updates. As we illustrate numerically, balanced boosting gives very good results on relatively hard classification problems, particularly on some that present a marked imbalance between class sizes.
    With partial support of Spain’s CICyT, TIC 01–572
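The distribution update that the abstract above says balanced boosting modifies can be illustrated with the standard AdaBoost reweighting step. This is a minimal sketch of plain AdaBoost only; it is not the balanced boosting procedure, whose margin-based pattern types are not specified in the abstract. The function name `adaboost_update` and the toy inputs are assumptions made for illustration.

```python
import math


def adaboost_update(weights, correct):
    """One standard AdaBoost reweighting step (not the paper's balanced variant).

    weights: current distribution over training patterns (sums to 1)
    correct: booleans, True where the weak learner classified the pattern correctly
    """
    # Weighted error of the weak learner under the current distribution.
    err = sum(w for w, ok in zip(weights, correct) if not ok)
    if not 0.0 < err < 0.5:
        raise ValueError("weak learner must have weighted error in (0, 0.5)")
    # Vote weight of this learner in the final classifier.
    alpha = 0.5 * math.log((1.0 - err) / err)
    # Raise the weight of misclassified patterns, lower the weight of the others.
    unnormalized = [
        w * math.exp(-alpha if ok else alpha) for w, ok in zip(weights, correct)
    ]
    total = sum(unnormalized)
    return alpha, [u / total for u in unnormalized]


# Hypothetical toy sample: four patterns, the last one misclassified.
weights = [0.25, 0.25, 0.25, 0.25]
alpha, new_weights = adaboost_update(weights, [True, True, True, False])
```

After this step the misclassified pattern carries half of the total weight, which is why a mislabelled pattern can come to dominate later rounds, the failure mode the abstract describes.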

    Parallel Perceptrons, Activation Margins and Imbalanced Training Set Pruning

    The final publication is available at Springer via http://dx.doi.org/10.1007/11492542_6
    Proceedings of Second Iberian Conference, IbPRIA 2005, Estoril, Portugal, June 7-9, 2005, Part II.
    A natural way to deal with training samples in imbalanced class problems is to prune them by removing redundant patterns, which are easy to classify and probably overrepresented, and label-noisy patterns, which belong to one class but are labelled as members of another. This allows classifier construction to focus on borderline patterns, likely to be the most informative ones. To define the above subsets appropriately, in this work we use as base classifiers the so-called parallel perceptrons, a novel approach to committee machine training that allows, among other things, margins for hidden unit activations to be defined naturally. We use these margins to define the above pattern types and to iteratively perform subsample selections in an initial training set that enhance classification accuracy and allow for a balanced classifier performance even when class sizes are greatly different.
    With partial support of Spain’s CICyT, TIC 01–572, TIN2004–0767

    Effectiveness evaluation of data mining based IDS

    Proceedings of: 6th Industrial Conference on Data Mining, ICDM 2006, Leipzig, Germany, July 14-15, 2006.
    Data mining has been widely applied to the problem of intrusion detection in computer networks. However, misconceptions about the underlying problem have led to results that are out of context. This paper shows that factors such as the probability of intrusion and the costs of responding to detected intrusions must be taken into account in order to compare the effectiveness of machine learning algorithms in the intrusion detection domain. Furthermore, we show the advantages of combining different detection techniques. Results on the well-known 1999 KDD dataset are shown.
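The claim above, that the probability of intrusion and the response costs affect how detectors compare, can be illustrated with a simple expected-cost calculation. This is a sketch under stated assumptions and not the paper's evaluation model: the cost decomposition, the function name `expected_cost`, the two detectors and every number below are hypothetical.

```python
def expected_cost(p_intrusion, tpr, fpr, cost_miss, cost_response, cost_false_alarm):
    """Expected cost per event for a detector (hypothetical cost model).

    p_intrusion: prior probability that an event is an intrusion
    tpr, fpr: detector true-positive and false-positive rates
    """
    # Intrusions that go undetected.
    miss = p_intrusion * (1.0 - tpr) * cost_miss
    # Responding to intrusions that were correctly detected.
    respond = p_intrusion * tpr * cost_response
    # Responding to alarms raised on normal traffic.
    false_alarm = (1.0 - p_intrusion) * fpr * cost_false_alarm
    return miss + respond + false_alarm


# Two made-up detectors: A is sensitive but noisy, B is conservative.
costs = dict(cost_miss=100.0, cost_response=10.0, cost_false_alarm=1.0)
a_rare = expected_cost(0.001, tpr=0.9, fpr=0.05, **costs)
b_rare = expected_cost(0.001, tpr=0.7, fpr=0.001, **costs)
a_common = expected_cost(0.2, tpr=0.9, fpr=0.05, **costs)
b_common = expected_cost(0.2, tpr=0.7, fpr=0.001, **costs)
```

With these made-up numbers, detector B is cheaper when intrusions are rare and detector A is cheaper when they are common, so a ranking based on detection rates alone would miss the change.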

    A value-based healthcare approach: Health-related quality of life and psychosocial functioning in women with Turner syndrome

    Objective: As part of the value-based healthcare programme in our hospital, a set of patient-reported outcome measures was developed together with patients and implemented in the dedicated Turner Syndrome (TS) outpatient clinic. This study aims to investigate different aspects of health-related quality of life (HR-QoL) and psychosocial functioning in women with TS in order to establish new possible targets for therapy. Design/Participants: A comprehensive set of questionnaires (EQ-5D, PSS-10, CIS-20, Ferti-QoL, FSFI) was developed and used to capture different aspects of HR-QoL and psychosocial functioning in a large cohort of adult women with Turner syndrome. All consecutive women aged ≥18 years who visited the outpatient clinic of our tertiary centre were eligible for inclusion. Results: Of the 201 eligible women who were invited to participate, 177 (age 34 ± 12 years, mean ± SD) completed at least one of the validated questionnaires (88%). Women with TS reported a lower health-related quality of life (EQ-5D: 0.857 vs 0.892, P = .003), perceived more stress (PSS-10: 14.7 vs 13.3, P = .012) and experienced increased fatigue (CIS-20: P < .001) compared to the general Dutch population. A relationship between noncardiac comorbidities (e.g. diabetes, orthopaedic complaints) and HR-QoL was found (R = .508). Conclusions: We showed that women with TS suffer from impaired HR-QoL, more perceived stress and increased fatigue compared to the general Dutch population. A relationship between noncardiac comorbidities and HR-QoL was found. Perceived stress and increased fatigue in particular can be considered targets for improvement of HR-QoL in women with TS.

    A note on ROC analysis and non-parametric estimate of sensitivity

    In the signal detection paradigm, the non-parametric index of sensitivity A′, as first introduced by Pollack and Norman (1964), is a popular alternative to the more traditional d′ measure of sensitivity. Smith (1995) clarified a confusion about the interpretation of A′ in relation to the area beneath proper receiver operating characteristic (ROC) curves, and provided a formula (which he called A′′) for this commonly held interpretation. However, he made an error in his calculations. Here, we rectify this error by providing the correct formula (which we call A) and compare the discrepancy that would have resulted. The corresponding measure for bias b is also provided. Since all such calculations apply to “proper” ROC curves with non-decreasing slopes, we also prove, as a separate result, the slope-monotonicity of ROC curves generated by a likelihood-ratio criterion.
    Peer Reviewed
    http://deepblue.lib.umich.edu/bitstream/2027.42/45761/1/11336_2003_Article_1119.pd
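For orientation, the index A′ discussed above is usually computed from a single hit rate H and false-alarm rate F with the computational formula commonly attributed to Grier (1971). The sketch below implements only that widely used A′ formula. The corrected measure A and the bias measure b from this paper are not reproduced, because the abstract does not state them. The function name `a_prime` is an assumption.

```python
def a_prime(hit_rate, false_alarm_rate):
    """Non-parametric sensitivity index A' from one (F, H) point.

    Uses the common computational form:
      H >= F: A' = 1/2 + (H - F)(1 + H - F) / (4 H (1 - F))
      H <  F: A' = 1/2 - (F - H)(1 + F - H) / (4 F (1 - H))
    """
    h, f = hit_rate, false_alarm_rate
    if not (0.0 <= h <= 1.0 and 0.0 <= f <= 1.0):
        raise ValueError("rates must lie in [0, 1]")
    if h == f:
        # Chance performance; also avoids 0/0 at the corners (0, 0) and (1, 1).
        return 0.5
    if h > f:
        return 0.5 + (h - f) * (1.0 + h - f) / (4.0 * h * (1.0 - f))
    return 0.5 - (f - h) * (1.0 + f - h) / (4.0 * f * (1.0 - h))


# Hypothetical observer: 80% hits, 20% false alarms.
example = a_prime(0.8, 0.2)
```

Swapping the two rates mirrors the value around 0.5, so below-chance performance maps to values under one half.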

    Distinct clinical symptom patterns in patients hospitalised with COVID-19 in an analysis of 59,011 patients in the ISARIC-4C study

    COVID-19 is clinically characterised by fever, cough, and dyspnoea. Symptoms affecting other organ systems have been reported. However, it is the clinical associations of different patterns of symptoms which influence diagnostic and therapeutic decision-making. In this study, we applied clustering techniques to a large prospective cohort of hospitalised patients with COVID-19 to identify clinically meaningful sub-phenotypes. We obtained structured clinical data on 59,011 patients in the UK (the ISARIC Coronavirus Clinical Characterisation Consortium, 4C) and used a principled, unsupervised clustering approach to partition the first 25,477 cases according to symptoms reported at recruitment. We validated our findings in a second group of 33,534 cases recruited to ISARIC-4C, and in 4,445 cases recruited to a separate study of community cases. Unsupervised clustering identified distinct sub-phenotypes. First, a core symptom set of fever, cough, and dyspnoea, which co-occurred with additional symptoms in three further patterns: fatigue and confusion, diarrhoea and vomiting, or productive cough. Presentations with a single reported symptom of dyspnoea or confusion were also identified, alongside a sub-phenotype of patients reporting few or no symptoms. Patients presenting with gastrointestinal symptoms were more commonly female, had a longer duration of symptoms before presentation, and had lower 30-day mortality. Patients presenting with confusion, with or without core symptoms, were older and had a higher unadjusted mortality. Symptom sub-phenotypes were highly consistent in replication analysis within the ISARIC-4C study. Similar patterns were externally verified in patients from a study of self-reported symptoms of mild disease. The large scale of the ISARIC-4C study enabled robust, granular discovery and replication. Clinical interpretation is necessary to determine which of these observations have practical utility. 
    We propose that four sub-phenotypes are usefully distinct from the core symptom group: gastro-intestinal disease, productive cough, confusion, and pauci-symptomatic presentations. Importantly, each is associated with an in-hospital mortality which differs from that of patients with core symptoms.